HMM Training and Phrase-Extraction Strategies for PORTAGE

Author

  • George Foster
Abstract

This report describes experiments to determine the optimal training and phrase-extraction settings for HMM-based phrase tables. The experiments were motivated by the implementation of new options for HMM training (bilexical conditioning, corrected end-distribution semantics, etc.) and by the elimination of the “insect”, which had adversely affected phrase extraction by adding phrase pairs consisting entirely of unaligned words.

Similar resources

Improved Automatic Extraction of Generation Process Model Commands and Its use for Generating Fundamental Frequency Contours for Training HMM-based Speech Synthesis

Generation process model of fundamental frequency (F0) contours can well represent F0 movements of speech keeping a clear relation with linguistic information of utterances. Therefore, by using the model, improvement of HMM-based speech synthesis is expected. One of major problems preventing the use of the model is that the performance of automatic extraction of the model parameters from observ...

Statistical Machine Translation and Automatic Speech Recognition under Uncertainty

Statistical modeling techniques have been applied successfully to natural language processing tasks such as automatic speech recognition (ASR) and statistical machine translation (SMT). Since most statistical approaches rely heavily on availability of data and the underlying model assumptions, reduction in uncertainty is critical to their optimal performance. In speech translation, the uncertai...

Noise-Robust Hidden Markov Models for Limited Training Data for Within-Species Bird Phrase Classification

Hidden Markov Models (HMMs) have been studied and used extensively in speech and birdsong recognition, but they are not robust to limited training data and noise. This paper presents two novel approaches to training continuous and discrete HMMs with extremely limited data. First, the algorithm learns the global Gaussian Mixture Models (GMMs) for all training phrases available. GMM parameters ar...

A targets-based superpositional model of fundamental frequency contours applied to HMM-based speech synthesis

Superpositional model of fundamental frequency (F0) contours as suggested by the Fujisaki model can well represent F0 movements of speech keeping a clear relation with linguistic information of utterances. Therefore, improvement of HMM-based speech synthesis is expected by using the merit of superpositional model. In this paper, a targets-based superpositional model is proposed in the light of ...

PORTAGE: A Phrase-Based Machine Translation System

This paper describes the participation of the Portage team at NRC Canada in the shared task of ACL 2005 Workshop on Building and Using Parallel Texts. We discuss Portage, a statistical phrase-based machine translation system, and present experimental results on the four language pairs of the shared task. First, we focus on the French-English task using multiple resources and techniques. Then we...

Publication date: 2009